{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 11-785 Recitation 0b: NumPy\n",
    "\n",
    "$\\hspace{5mm}$Omar Khattab - okhattab@cmu.edu"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Introduction"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- What is NumPy?\n",
    "- What is vectorization?\n",
    "- Why vectorization?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Download and Installation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Follow instructions at https://scipy.org/install.html"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Introductory Example: Wx + b"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Naive Python Implementation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Inputs\n",
    "W = [[1, 1, 1],\n",
    "     [2, 2, 2],\n",
    "     [3, 3, 3]]\n",
    "\n",
    "x = [7, 8, 5]\n",
    "b = [1, 1, 1]\n",
    "\n",
    "# Output\n",
    "y = [0, 0, 0]\n",
    "\n",
    "# Naive Computation of Wx + b\n",
    "for i in range(len(W)):\n",
    "    for j in range(len(x)):\n",
    "        y[i] += W[i][j] * x[j]\n",
    "\n",
    "for i in range(len(y)):\n",
    "    y[i] += b[i]\n",
    "\n",
    "y"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### NumPy-based Vectorized Implementation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "W = np.array([[1, 1, 1],\n",
    "              [2, 2, 2],\n",
    "              [3, 3, 3]])\n",
    "W"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "x = np.array([7, 8, 5])\n",
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "b = np.ones(3)\n",
    "b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "W.dot(x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "W.dot(x) + b"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Multi-Dimensional Arrays"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "T = np.random.random((2,3,4))\n",
    "T"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "W = T[1]\n",
    "W.shape, W.dtype"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = np.full((4,3), 11785, dtype=np.int32)\n",
    "X.shape, X.dtype"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "W.dot(X)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X2 = np.zeros(X.shape)\n",
    "\n",
    "np.array_equal(W.dot(X2), np.zeros((3,3)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Basic Element-wise Operations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y = np.arange(10)\n",
    "y, np.array(range(10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y == np.array(range(10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y + 1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y * 10"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(y + 1)**2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y + y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y / (y + 1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.sqrt(y**2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y.min(), y.max(), y.std(), y.sum()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Indexing NumPy Arrays"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.save('tensor.npy', np.arange(2 * 4 * 4).reshape(2, 4, 4))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "T = np.load('tensor.npy')\n",
    "T,  T.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "T[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "T[1,0,0]  # more efficient version of T[1][0][0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "T[0,1:3,0:2]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(T % 10) == 0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "T[(T % 10) == 0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example: Masking and Clipping"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "W = np.random.randint(0, 10, (4, 4))\n",
    "W"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mask = np.ones_like(W)\n",
    "mask[1] = 0\n",
    "mask"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "W * mask"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "W > 5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "W2 = W\n",
    "W2[W2 > 5] = 5\n",
    "W2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "np.minimum(W, 5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Computation Along Axes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "A = np.random.random((3,3))\n",
    "A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "A.sum()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "A.sum(axis=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "A.sum(axis=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "A.max(axis=0), A.argmax(axis=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Combining Arrays: Stacking & Concatenation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([1, 1, 1, 1, 1]), array([2, 2, 2, 2, 2]), array([3, 3, 3, 3, 3]))"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = np.full(5, 1)\n",
    "b = np.full(5, 2)\n",
    "c = np.full(5, 3)\n",
    "a, b, c"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Stacking creates a new axis. It joins the supplied arrays, which must have the same shape, along that axis.\n",
    "Indexing a single element from that axis returns the appropriate input array."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([[1, 2, 3],\n",
       "        [1, 2, 3],\n",
       "        [1, 2, 3],\n",
       "        [1, 2, 3],\n",
       "        [1, 2, 3]]), (5, 3))"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "B = np.stack([a, b, c], axis=1)\n",
    "B, B.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([2, 2, 2, 2, 2])"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "B[:,1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Concatenating arrays joins them along an *existing* aixs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[101, 102, 103, 201, 202, 203],\n",
       "       [101, 102, 103, 201, 202, 203],\n",
       "       [101, 102, 103, 201, 202, 203],\n",
       "       [101, 102, 103, 201, 202, 203],\n",
       "       [101, 102, 103, 201, 202, 203]])"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "C = np.concatenate([B+100, B+200], axis=1)\n",
    "C"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Transposing and Reshaping"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([[1, 2, 3],\n",
       "        [1, 2, 3],\n",
       "        [1, 2, 3],\n",
       "        [1, 2, 3],\n",
       "        [1, 2, 3]]), (5, 3))"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "B, B.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([[1, 1, 1, 1, 1],\n",
       "        [2, 2, 2, 2, 2],\n",
       "        [3, 3, 3, 3, 3]]), (3, 5))"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "BT = B.T\n",
    "BT, BT.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([[1, 2, 3, 1, 2],\n",
       "        [3, 1, 2, 3, 1],\n",
       "        [2, 3, 1, 2, 3]]), (3, 5))"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "BR = B.reshape(3, 5)\n",
    "BR, BR.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Broadcasting"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "A = np.random.random((8, 4))\n",
    "B = np.random.random((4, 3))\n",
    "A.dot(B)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "b = np.array([10, 20, 30])\n",
    "A.dot(B) + b"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Arrays must have same ending dimensions to broadcast."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(np.arange(12).reshape((3,4)) + 0.01 * np.arange(4))\n",
    "print(np.arange(8).reshape((2,4)) + 0.01 * np.arange(24).reshape((3,2,4)))\n",
    "print(np.arange(6).reshape((3,1,2)) + 0.01 * np.arange(18).reshape((3,3,2)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Mismatched dimensions cause an error."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "  x = np.random.random((3,3,4))+np.random.random((3,2,4))\n",
    "except ValueError as e:\n",
    "  print(e)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Saving and Loading"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Saving and loading a single NumPy array"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Save single array\n",
    "x = np.random.random((5,))\n",
    "print(x)\n",
    "\n",
    "np.save('tmp.npy', x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load the array\n",
    "y = np.load('tmp.npy')\n",
    "\n",
    "print(y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Saving and loading a dictionary of NumPy arrays"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Save dictionary of arrays\n",
    "x1 = np.random.random((2,))\n",
    "y1 = np.random.random((3,))\n",
    "print(x1, y1)\n",
    "\n",
    "np.savez('tmp.npz', x=x1, y=y1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load the dictionary of arrays\n",
    "data = np.load('tmp.npz')\n",
    "\n",
    "print(data['x'])\n",
    "print(data['y'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## References"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can find a more complete introduction at https://docs.scipy.org/doc/numpy/user/quickstart.html"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following links were helpful in preparing this notebook:\n",
    "\n",
    "- https://docs.scipy.org/doc/numpy/reference/index.html\n",
    "- https://github.com/cmudeeplearning11785/deep-learning-tutorials/blob/master/recitation-2/Tutorial-numpy.ipynb\n",
    "- http://cs231n.github.io/python-numpy-tutorial/#numpy"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
